Do Language Models Understand Anything? On the Ability of LSTMs to Understand Negative Polarity Items
In this paper, we attempt to link the inner workings of a neural language
model to linguistic theory, focusing on a complex phenomenon well discussed in
formal linguistics: (negative) polarity items. We briefly discuss the leading
hypotheses about the licensing contexts that allow negative polarity items and
evaluate to what extent a neural language model has the ability to correctly
process a subset of such constructions. We show that the model finds a relation
between the licensing context and the negative polarity item and appears to be
aware of the scope of this context, which we extract from a parse tree of the
sentence. With this research, we hope to pave the way for other studies linking
formal linguistics to deep learning.
Comment: Accepted to the EMNLP workshop "Analyzing and interpreting neural
networks for NLP"
The Fast and the Flexible: training neural networks to learn to follow instructions from small data
Learning to follow human instructions is a long-pursued goal in artificial
intelligence. The task becomes particularly challenging if no prior knowledge
of the employed language is assumed while relying only on a handful of examples
to learn from. Work in the past has relied on hand-coded components or manually
engineered features to provide strong inductive biases that make learning in
such situations possible. In contrast, here we seek to establish whether this
knowledge can be acquired automatically by a neural network system through a
two-phase training procedure: a (slow) offline learning stage where the network
learns about the general structure of the task and a (fast) online adaptation
phase where the network learns the language of a new given speaker. Controlled
experiments show that when the network is exposed to familiar instructions
containing novel words, the model adapts very efficiently to the new
vocabulary. Moreover, even for human speakers whose language usage can depart
significantly from our artificial training language, our network can still make
use of its automatically acquired inductive bias to learn to follow
instructions more effectively.
Analysing Neural Language Models: Contextual Decomposition Reveals Default Reasoning in Number and Gender Assignment
Extensive research has recently shown that recurrent neural language models
are able to process a wide range of grammatical phenomena. How these models are
able to perform these remarkable feats so well, however, is still an open
question. To gain more insight into what information LSTMs base their decisions
on, we propose a generalisation of Contextual Decomposition (GCD). In
particular, this setup enables us to accurately distil which part of a
prediction stems from semantic heuristics, which part truly emanates from
syntactic cues and which part arises from the model biases themselves instead.
We investigate this technique on tasks pertaining to syntactic agreement and
co-reference resolution and discover that the model strongly relies on a
default reasoning effect to perform these tasks.
Comment: To appear at CoNLL201
The Validity of Evaluation Results: Assessing Concurrence Across Compositionality Benchmarks
NLP models have progressed drastically in recent years, according to numerous
datasets proposed to evaluate performance. Questions remain, however, about how
particular dataset design choices may impact the conclusions we draw about
model capabilities. In this work, we investigate this question in the domain of
compositional generalization. We examine the performance of six modeling
approaches across 4 datasets, split according to 8 compositional splitting
strategies, ranking models by 18 compositional generalization splits in total.
Our results show that: i) the datasets, although all designed to evaluate
compositional generalization, rank modeling approaches differently; ii)
datasets generated by humans align better with each other than they do with
synthetic datasets, or than synthetic datasets among themselves; iii)
generally, whether datasets are sampled from the same source is more predictive
of the resulting model ranking than whether they maintain the same
interpretation of compositionality; and iv) which lexical items are used in the
data can strongly impact conclusions. Overall, our results demonstrate that
much work remains to be done when it comes to assessing whether popular
evaluation datasets measure what they intend to measure, and suggest that
elucidating more rigorous standards for establishing the validity of evaluation
sets could benefit the field.
Comment: CoNLL202
Analysing the potential of seq-to-seq models for incremental interpretation in task-oriented dialogue
We investigate how encoder-decoder models trained on a synthetic dataset of
task-oriented dialogues process disfluencies, such as hesitations and
self-corrections. We find that, contrary to earlier results, disfluencies have
very little impact on the task success of seq-to-seq models with attention.
Using visualisation and diagnostic classifiers, we analyse the representations
that are incrementally built by the model, and discover that models develop
little to no awareness of the structure of disfluencies. However, adding
disfluencies to the data appears to help the model create clearer
representations overall, as evidenced by the attention patterns the different
models exhibit.
Comment: accepted to the EMNLP2018 workshop "Analyzing and interpreting neural
networks for NLP"
Transcoding compositionally: using attention to find more generalizable solutions
While sequence-to-sequence models have shown remarkable generalization power
across several natural language tasks, the solutions they construct are argued
to be less compositional than human-like generalization. In this paper, we
present seq2attn, a new architecture that is specifically designed to exploit
attention to find compositional patterns in the input. In seq2attn, the two
standard components of an encoder-decoder model are connected via a transcoder
that modulates the information flow between them. We show that seq2attn can
successfully generalize, without requiring any additional supervision, on two
tasks which are specifically constructed to challenge the compositional skills
of neural networks. The solutions found by the model are highly interpretable,
allowing easy analysis of both the types of solutions that are found and
potential causes for mistakes. We exploit this opportunity to introduce a new
paradigm to test compositionality that studies the extent to which a model
overgeneralizes when confronted with exceptions. We show that seq2attn exhibits
such overgeneralization to a larger degree than a standard sequence-to-sequence
model.
Comment: to appear at BlackboxNLP 2019, AC